66 research outputs found

    Complete genome sequences and genomic characterization of five plasmids harbored by environmentally persistent Cronobacter sakazakii strains ST83 H322 and ST64 GK1025B obtained from powdered infant formula manufacturing facilities

    Background: Cronobacter sakazakii is a foodborne pathogen that causes septicemia, meningitis, and necrotizing enterocolitis in neonates and infants. The current research details the full genome sequences of two extremely persistent C. sakazakii strains (H322 and GK1025B) isolated from powdered infant formula (PIF) manufacturing settings. In addition, the genetic attributes associated with five plasmids, pH322_1, pH322_2, pGK1025B_1, pGK1025B_2, and pGK1025B_3, are described. Materials and Methods: Using PacBio single-molecule real-time (SMRT) sequencing technology, whole genome sequence (WGS) assemblies of C. sakazakii H322 (sequence type [ST] 83, clonal complex [CC] 83) and GK1025B (ST64, CC64) were generated. Plasmids, also sequenced, were aligned with phylogenetically related episomes to identify conserved and missing genomic regions. Results: A truncated ~13 Kbp type 6 secretion system (T6SS) gene cluster harbored on virulence plasmids pH322_2 and pGK1025B_2, and a second large deletion (~6 Kbp) on pH322_2, which included genes for a tyrosine-type recombinase/integrase, a hypothetical protein, and a phospholipase D, were identified. Within the T6SS of pH322_2 and pGK1025B_2, an arsenic resistance operon was identified that is shared with plasmids pSP291_1 and pESA3. In addition, PHASTER analysis identified an intact 96.9 Kbp Salmonella SSU5 prophage gene cluster in pH322_1 and pGK1025B_1 and showed that these two plasmids were phylogenetically related to C. sakazakii plasmids pCS1, pCsa767a, pCsaC757b, and pCsaC105731a. Plasmid pGK1025B_3 was identified as a novel conjugative Cronobacter plasmid. Furthermore, WGS analysis identified a ~16.4 Kbp type 4 secretion system gene cluster harbored on pGK1025B_3, which contained a phospholipase D gene, a key virulence factor in several host–pathogen diseases. Conclusion: These data provide high-resolution information on C. sakazakii genomes and emphasize the need for further surveillance studies to link genotype to phenotype of strains from previous investigations. These results provide baseline data necessary for future in-depth investigations of C. sakazakii strains that colonize PIF manufacturing facility settings. Genomic analyses of these two C. sakazakii strains and five associated plasmids will contribute to a better understanding of this pathogen's survival and persistence within various “built environments” like PIF manufacturing facilities.

    The complete mitochondrial genome of the foodborne parasitic pathogen Cyclospora cayetanensis

    Cyclospora cayetanensis is a human-specific coccidian parasite responsible for several food- and water-related outbreaks around the world, including the most recent ones in the USA in 2013 and 2014, which involved over 900 persons. Multicopy organellar DNA such as mitochondrial genomes has been particularly informative for detection and genetic traceback analysis in other parasites. We sequenced C. cayetanensis genomic DNA obtained from stool samples from patients infected with Cyclospora in Nepal using the Illumina MiSeq platform. By bioinformatically filtering out the metagenomic reads of non-coccidian origin and concentrating the reads by targeted alignment, we were able to obtain contigs containing Eimeria-like mitochondrial, apicoplastic, and some chromosomal genomic fragments. A mitochondrial genomic sequence was assembled and confirmed by cloning and sequencing targeted PCR products amplified from Cyclospora DNA using primers based on our draft assembly sequence. The results show that the C. cayetanensis mitochondrial genome is 6274 bp in length, with 33% GC content, and likely exists in concatemeric arrays as in Eimeria mitochondrial genomes. Phylogenetic analysis of the C. cayetanensis mitochondrial genome places this organism in a tight cluster with Eimeria species. The mitochondrial genome of C. cayetanensis contains three protein-coding genes, cytochrome b (cytb), cytochrome c oxidase subunit 1 (cox1), and cytochrome c oxidase subunit 3 (cox3), in addition to 14 large subunit (LSU) and nine small subunit (SSU) fragmented rRNA genes.

    Phylogenomic Analysis of Salmonella enterica subsp. enterica Serovar Bovismorbificans from Clinical and Food Samples Using Whole Genome Wide Core Genes and kmer Binning Methods to Identify Two Distinct Polyphyletic Genome Pathotypes

    Salmonella enterica subsp. enterica serovar Bovismorbificans has caused multiple outbreaks involving the consumption of produce, hummus, and processed meat products worldwide. To elucidate the intra-serovar genomic structure of S. Bovismorbificans, a core-genome analysis with 2690 loci (based on 150 complete genomes representing Salmonella enterica serovars developed as part of this study) and a k-mer-binning based strategy were carried out on 95 whole genome sequencing (WGS) assemblies from Swiss, Canadian, and USA collections of S. Bovismorbificans strains from foodborne infections. Data mining of a digital DNA tiling array of legacy SARA and SARB strains was conducted to identify near-neighbors of S. Bovismorbificans. The core genome analysis and the k-mer-binning methods identified two polyphyletic clusters, each with emerging evolutionary properties. Four STs (2640, 142, 1499, and 377), which constituted the majority of the publicly available WGS datasets from >260 strains analyzed by the k-mer-binning based strategy, contained a conserved core genome backbone with a different evolutionary lineage as compared to strains comprising the other cluster (ST150). In addition, the assortment of genotypic features contributing to pathogenesis and persistence, such as antimicrobial resistance, prophage, plasmid, and virulence factor genes, was assessed to understand the emerging characteristics of this serovar that are relevant clinically and for food safety concerns. The phylogenomic profiling of polyphyletic S. Bovismorbificans in this study corresponds to intra-serovar variations observed in S. Napoli and S. Newport serovars using similar high-resolution genomic profiling approaches and contributes to the understanding of the evolution and sequence divergence of foodborne Salmonellae. These intra-serovar differences may have to be thoroughly understood for the accurate classification of foodborne Salmonella strains needed for the uniform development of future food safety mitigation strategies.

    Characterization of Cronobacter sakazakii Strains Originating from Plant-Origin Foods Using Comparative Genomic Analyses and Zebrafish Infectivity Studies

    Cronobacter sakazakii continues to be isolated from ready-to-eat fresh and frozen produce, flours, dairy powders, cereals, nuts, and spices, in addition to the conventional sources of powdered infant formulae (PIF) and PIF production environments. To understand the sequence diversity, phylogenetic relationship, and virulence of C. sakazakii originating from plant-origin foods, comparative molecular and genomic analyses and zebrafish infection (ZI) studies were applied to 88 strains. Whole genome sequences of the strains were generated for detailed bioinformatic analysis. PCR analysis showed that all strains possessed a pESA3-like virulence plasmid similar to that of the reference C. sakazakii clinical strain BAA-894. Core genome analysis confirmed a shared genomic backbone with other C. sakazakii strains from food, clinical, and environmental sources. Emerging nucleotide diversity in these plant-origin strains was highlighted using single nucleotide polymorphic alleles in 2000 core genes. DNA hybridization analyses using a pan-genomic microarray showed that these strains clustered according to sequence types (STs) identified by multi-locus sequence typing (MLST). PHASTER analysis identified 185 intact prophage gene clusters encompassing 22 different prophages, including three intact Cronobacter prophages: ENT47670, ENT39118, and phiES15. AMRFinderPlus analysis identified the CSA family class C β-lactamase gene in all strains, and a plasmid-borne mcr-9.1 gene was identified in three strains. ZI studies showed that some plant-origin C. sakazakii display virulence comparable to clinical strains. Finding virulent plant-origin C. sakazakii possessing significant genomic features of clinically relevant STs suggests that these foods can serve as potential transmission vehicles and supports widening the scope of continued surveillance for this important foodborne pathogen.

    Comparative Genomic Characterization of the Highly Persistent and Potentially Virulent Cronobacter sakazakii ST83, CC65 Strain H322 and Other ST83 Strains

    Cronobacter (C.) sakazakii is an opportunistic pathogen that has been associated with serious infections with high mortality rates, predominantly in pre-term, low-birth-weight, and/or immunocompromised neonates and infants. Infections have been epidemiologically linked to consumption of intrinsically and extrinsically contaminated lots of reconstituted powdered infant formula (PIF); contamination of such products thus poses a challenge for the PIF-producing industry. We present the draft genome of C. sakazakii H322, a highly persistent sequence type (ST) 83, clonal complex (CC) 65, serotype O:7 strain obtained from a batch of non-released contaminated PIF product. The presence of this strain in the production environment was traced back more than 4 years. Whole genome sequencing (WGS) of this strain together with four more ST83 strains (PIF production environment-associated) confirmed a high degree of sequence homology among four of the five strains. Phylogenetic analysis using microarray (MA) and WGS data showed that the ST83 strains were highly phylogenetically related, and MA showed that the strains differed from one another by between 5 and 38 genes. All strains possessed the pESA3-like virulence plasmid and one strain possessed a pESA2-like plasmid. In addition, a pCS1-like plasmid was also found. To assess the potential in vivo pathogenicity of the ST83 strains, each strain was subjected to infection studies using the recently developed zebrafish embryo model. Our results showed a high (90–100%) zebrafish mortality rate for all of these strains, suggesting a high risk of infection and illness in neonates potentially exposed to PIF contaminated with ST83 C. sakazakii strains. In summary, virulent ST83, CC65, serotype CsakO:7 strains, though rarely found intrinsically in PIF, can persist within a PIF manufacturing facility for years and potentially pose significant quality assurance challenges to the PIF manufacturing industry.

    The Rat Genome Database (RGD): developments towards a phenome database

    The Rat Genome Database (RGD) (http://rgd.mcw.edu) aims to meet the needs of its community by providing genetic and genomic infrastructure while also annotating the strengths of rat research: biochemistry, nutrition, pharmacology and physiology. Here, we report on RGD's development towards creating a phenome database. Recent developments can be categorized into three groups: (i) improved data collection and integration to match the increased volume and biological scope of research; (ii) knowledge representation augmented by the implementation of a new ontology and annotation system; and (iii) the addition of quantitative trait loci data from rat, mouse and human to our advanced comparative genomics tools, as well as the creation of new, and enhancement of existing, tools to enable users to efficiently browse and survey research data. The emphasis is on helping researchers find genes responsible for disease through the use of rat models. These improvements, combined with the genomic sequence of the rat, have led to a successful year at RGD, with over two million page accesses representing a more than 4-fold increase in a year. Future plans call for increased annotation of biological information on the rat elucidated through its use as a model for human pathobiology. The continued development of toolsets will facilitate integration of these data into the context of rat genomic sequence, as well as allow comparisons of biological and genomic data with the human genomic sequence and with those of an increasing number of organisms.

    Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones

    The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.